Phenotyping COVID-19 respiratory failure in spontaneously breathing patients with AI on lung CT-scan

Background: Automated analysis of lung computed tomography (CT) scans may help characterize subphenotypes of acute respiratory illness. We integrated lung CT features measured via deep learning with clinical and laboratory data in spontaneously breathing subjects to enhance the identification of COVID-19 subphenotypes.
Methods: This is a multicenter observational cohort study in spontaneously breathing patients with COVID-19 respiratory failure who underwent early lung CT within 7 days of admission. We explored lung CT images using deep learning approaches to quantitative and qualitative analyses; latent class analysis (LCA) using clinical, laboratory and lung CT variables; and regional differences between subphenotypes following 3D spatial trajectories.
Results: Complete datasets were available in 559 patients. LCA identified two subphenotypes (subphenotype 1 and 2). As compared with subphenotype 2 (n = 403), subphenotype 1 patients (n = 156) were older, had higher inflammatory biomarkers, and were more hypoxemic. Lungs in subphenotype 1 had a higher density gravitational gradient with a greater proportion of consolidated lungs as compared with subphenotype 2. In contrast, subphenotype 2 had a higher density submantellar-hilar gradient with a greater proportion of ground glass opacities as compared with subphenotype 1. Subphenotype 1 showed higher prevalence of comorbidities associated with endothelial dysfunction and higher 90-day mortality than subphenotype 2, even after adjustment for clinically meaningful variables.
Conclusions: Integrating lung-CT data in an LCA allowed us to identify two subphenotypes of COVID-19, with different clinical trajectories. These exploratory findings suggest a role of automated imaging characterization guided by machine learning in subphenotyping patients with respiratory failure.
Trial registration: ClinicalTrials.gov Identifier: NCT04395482. Registration date: 19/05/2020.
Supplementary Information The online version contains supplementary material available at 10.1186/s13054-024-05046-3.


Introduction
Categorizing heterogeneous populations of critically ill patients into distinct groups has recently gained prominence because of its potential to predict outcomes. Such an approach is applicable to disparate conditions such as sepsis [1,2], acute kidney injury [3], and acute respiratory distress syndrome (ARDS) [4]. In ARDS, latent class analysis (LCA) showed promise in the identification of subphenotypes with different biologic features, treatment responses, and clinical trajectories [5-8].
Computed tomography (CT) may contribute to better stratification of the severity of acute pulmonary illness through topographic description of lung morphology. In ARDS, patient categorization by diffuse rather than focal infiltrates on lung CT was associated with higher mortality and worse respiratory mechanics [9]. However, the recent LIVE study [10] was unable to show that an imaging-guided strategy of mechanical ventilation improved survival. Patient miscategorization due to heterogeneous protocols of image acquisition and subjective analysis may explain this result.
Rapid technological improvements in image processing and data modelling enable the objective characterization of lung morphology patterns for prognostic purposes [11]. Machine learning has recently been proposed to quantitatively and qualitatively evaluate large datasets of CT images. In particular, automated segmentation (i.e., separation of pulmonary from non-pulmonary tissue) by deep neural networks allows high-throughput image processing in ways that were previously impossible [12,13]. The potential to use automatically processed CT data to predict outcomes, however, is still unexplored in acute respiratory illness.
Because of the success of LCA, using this statistical approach to integrate CT data with clinical and biological variables may enhance patient stratification in terms of severity and response to treatment. We hypothesized that, in a large population of patients with acute respiratory illness, LCA incorporating lung CT data analyzed by a deep neural network may improve characterization of pathophysiology and offer a tool to triage patients, correlating radiological patterns with disease progression and treatment response. We therefore tested this hypothesis in spontaneously breathing COVID-19 patients who, being hospitalized, had a high risk of evolving to acute respiratory failure and death. The objectives of our study were to: (1) identify COVID-19 subphenotypes by incorporating pattern recognition of lung CT scans in an LCA; (2) characterize regional quantitative and qualitative lung CT data in each COVID-19 subphenotype by deep learning analysis; and (3) explore whether the severity stratification by LCA of clinical, laboratory and CT data may have an independent association with mortality.

Ethical consideration and data acquisition
The study was performed in accordance with the Declaration of Helsinki and in agreement with the Italian good clinical practice recommendations (D.M. Sanità del 15/07/97 e s.m.i.) and with the applicable hospital healthcare protocols. No change to current clinical practice or to the clinical protocols in use was made in the enrolled study population. Considering the retrospective nature of the study, we did not anticipate any added risks or benefits for the patients. Moreover, given the technical difficulties of obtaining informed consent from patients in the emergency health context of the pandemic, informed consent was waived. For this reason, and given the great public interest of the project, the research was conducted under the authorizations guaranteed by Article 89 of the GDPR (EU Regulation 2016/679), which covers the processing of health data for purposes of public interest, scientific or historical research, or statistics. Personal data were handled in compliance with the European Regulation on the Protection of Personal Data (GDPR), Legislative Decree 196/2003 and subsequent amendments and additions, and any other Italian law applicable to the protection of personal data (henceforth referred to as the "applicable data protection law"). Data were collected in pseudonymized form through paper case report forms, digitized on a University of Milano-Bicocca institutional Google Drive account and analyzed by the scientific coordinator of the project (E.R.). Approval for the study was obtained before data acquisition from the local institutional review board of the coordinating center, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy (approval date: 24/04/2020; number 3375), and from the local institutional review board of each participating center (Policlinico San Marco, Gruppo Ospedaliero San Donato, Zingonia, Bergamo, Italy; Ospedale Infermi, Rimini, Italy; Ospedale Papa Giovanni XXIII, Bergamo, Italy; Ospedale Alessandro Manzoni, Lecco, Italy; Arcispedale Sant'Anna, Ferrara, Italy; Ospedale Santa Maria delle Stelle, Melzo, Italy; Istituto Sicurezza Sociale, Repubblica di San Marino).
Baseline characteristics (age, sex, body mass index, comorbidities) and clinical illness severity (Sequential Organ Failure Assessment (SOFA) and pH) were collected, together with laboratory biomarkers, blood gas analysis, respiratory assistance, and hemodynamic data at hospital admission. Lung CT scans acquired for clinical purposes within the first week after hospital admission were obtained. Data on drug treatments and complications during hospital admission, and on outcomes including length of stay (in ICU and in hospital), use of non-invasive respiratory support, mechanical ventilation-free days, limitation of life sustaining measures, ICU mortality, and hospital mortality were recorded.

Inclusion and exclusion criteria
Inclusion criteria: For the current analysis we included patients who were admitted to the Emergency Department with a clinical diagnosis of COVID-19 respiratory failure.

Chest CT quantification
The lung CT images were collected, anonymized and sent via the University of Milano-Bicocca institutional Google Drive account to the University of Pennsylvania, Department of Anesthesiology and Critical Care and Department of Radiology (M.C., Y.X., S.G., J.H.), in a de-identified format for advanced quantitative analysis using deep learning algorithms [14]. CT images were segmented using a previously validated convolutional neural network (CNN) [12]. The masks included vasculature and airways inside the lungs, but excluded major airways (e.g., trachea) and vessels outside the lung lobes in the hilum area. The CNN thus provided automated segmentation of the lungs into 15 regions of interest (ROI) for the subsequent analysis, as follows:
• whole lung;
• five individual lobes (left upper lobe (LUL), left lower lobe (LLL), right upper lobe (RUL), right middle lobe (RML), and right lower lobe (RLL));
• nine regions along the 3 axes of space (i.e., X, Y and Z), each axis being divided into three equally sized regions (by pixel counts): horizontal ventral-to-dorsal regions (Ventral; Dorso-Ventral; Dorsal), vertical apical-to-basal regions (Apical; Basal-Apical; Basal), and concentric submantellar-to-hilar regions (Submantellar; Central; Hilar) [15].
After segmentation, whole-lung and lobar lung masks were inspected by a trained investigator (Y.X.) and manually adjusted using ITK-SNAP software [16]. For each ROI, six parameters were analyzed [17,18]. In sum, a total of ninety lung features were calculated for each patient, consisting of six parameters for each of fifteen regions. We calculated the gravitational (ventrodorsal), the apical-basal, and the submantellar-hilar lung density gradients by linear fitting of density, percentage of GGO, and percentage of consolidation in the three corresponding regions. The slope of this linear fit was compared between latent classes.
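As a minimal illustration of the gradient calculation described above, the sketch below fits a least-squares line through the values of one parameter in the three regions along one axis and returns its slope. It assumes that the three equally sized regions are coded as 0, 1 and 2 (the text does not state the coding); the function name and the density values are hypothetical and are not study data.

```python
from statistics import mean


def gradient_slope(region_values):
    """Least-squares slope of a regional parameter against region index.

    region_values: values ordered along one axis, e.g. ventral, middle, dorsal.
    Regions are assumed to be equally spaced and coded 0, 1, 2 (an assumption).
    """
    xs = list(range(len(region_values)))
    x_bar = mean(xs)
    y_bar = mean(region_values)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, region_values))
    return sxy / sxx


# Hypothetical mean lung density (HU) in the ventral, middle and dorsal regions.
density_hu = [-780.0, -700.0, -560.0]
slope = gradient_slope(density_hu)  # HU gained per region step, ventral to dorsal
print(f"gravitational density gradient: {slope:.1f} HU per region")
```

The same function could be applied to the percentage of GGO or of consolidation, and to the apical-basal and submantellar-hilar axes; the per-patient slopes would then be compared between latent classes.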

Latent class analysis
Latent class analysis (LCA) is a well-established statistical technique that employs mixture modeling to identify the most appropriate model for a data set, based on the premise that the data encompass several unobserved groups or classes. Unlike traditional regression analyses, which aim to delineate the relationship between predefined independent variables and a specified outcome, LCA identifies potential subgroups within the data based on combinations of baseline variables, without necessarily linking them to an outcome.
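In generic notation (not taken from the study), a mixture model with K latent classes can be written as below. Here the class proportions are denoted by \pi_k, the class-specific densities by f_k with parameters \theta_k, and c_i is the unobserved class of patient i; the second line is the posterior probability commonly used to assign each patient to a class. All symbols are introduced here for illustration only.

```latex
\begin{align}
f(\mathbf{x}_i) &= \sum_{k=1}^{K} \pi_k \, f_k(\mathbf{x}_i \mid \boldsymbol{\theta}_k),
\qquad \sum_{k=1}^{K} \pi_k = 1, \\
P(c_i = k \mid \mathbf{x}_i) &=
\frac{\pi_k \, f_k(\mathbf{x}_i \mid \boldsymbol{\theta}_k)}
     {\sum_{j=1}^{K} \pi_j \, f_j(\mathbf{x}_i \mid \boldsymbol{\theta}_j)}.
\end{align}
```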
We implemented LCA following the methodological guidelines described by Sinha et al. [19], combining clinical, laboratory, and CT data. The choice of candidate variables (n = 15) for the LCA model was based on clinical illness severity at hospital admission and on previously published work [8,20]. Correlations between variables were explored, and the correlation matrix is plotted in online supplemental Fig. 1. The absolute value of the correlation was greater than 0.7 for five pairs [(HCO3−, PaCO2), (Lung gas volume, GGO), (Lung gas volume, Mean lung HU), (GGO, Mean lung HU), (Consolidation, Mean lung HU)], indicating strong correlations. Therefore, mean lung HU, GGO (proportion of ground glass opacities) and HCO3− were removed to avoid high correlation. From 559 samples, the final 12 variables (i.e., PaO2/FiO2, Lung gas volume, Temperature, PaCO2, Total Bilirubin, Platelets, Age, Lung mass, Creatinine, hs-CRP, WBC, Consolidation fraction) were included in LCA models with different numbers of classes and specifications of the covariance matrix structure. Depending on the model configuration, the identified classes can show different class-specific covariances [21]. We explored three settings of the variance-covariance structure, as shown in supplemental Fig. 2. Under the assumption of freed variances and covariances, we compared the Bayesian information criterion (BIC), averaged uncertainty and entropy across the entire sample among models with 2, 3, 4, 5 and 6 classes (supplemental Table 1). The optimal model, which yielded the smallest BIC and uncertainty, was the two-class model. In addition, entropy was computed as a measure of effective separation. However, it is not a reliable sole criterion for choosing the best model because a model that overfits may also exhibit high entropy [19].
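The correlation screening step described above can be sketched as follows. This is only an illustration under stated assumptions: the 0.7 threshold comes from the text, while the function names, the variable names and the toy values are hypothetical and are not study data.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def strongly_correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson_r(columns[a], columns[b])
        if abs(r) > threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical toy values for three candidate LCA variables in five patients.
toy = {
    "lung_gas_volume": [4.1, 3.6, 3.0, 2.4, 1.9],
    "ggo_fraction": [0.05, 0.10, 0.18, 0.27, 0.35],
    "temperature": [37.9, 36.8, 38.4, 37.0, 37.6],
}
pairs = strongly_correlated_pairs(toy)
print(pairs)  # pairs flagged here are candidates for dropping one member
```

Which member of a flagged pair to drop is a judgment call that the screening itself does not make.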

Statistical analysis
Continuous data are reported as mean ± standard deviation (SD) or median and interquartile range (IQR). Categorical variables are expressed as proportions (frequency). Differences between the 2 clusters were assessed by unpaired Student's t-test or Mann-Whitney U test, as appropriate. Differences between categorical data were assessed by using Pearson's chi-square test or Fisher's exact test. Correlation between quantitative lung computed tomography data and gas exchange was assessed by linear regression analysis, and the Pearson correlation coefficient was reported. Differences in 90-day survival across subphenotypes were explored by the Kaplan-Meier approach. Univariable and multivariable Cox proportional hazards regression models were used to explore the independent association of subphenotypes with 90-day mortality by including clinically meaningful covariates. Mortality risk was reported as hazard ratio with 95% confidence interval. Clinically meaningful covariates were chosen a priori to adjust the multivariable models, as follows: sex, the presence of any comorbidities, and the decision to limit life sustaining measures. Adjusted models were ranked by their Akaike information criterion (AIC) and their Bayesian information criterion (BIC). AIC and BIC address both goodness-of-fit and simplicity of a model. Since we compared models with the same number of independent variables for the same set of patients, the lowest AIC and BIC represented the best fit model. Statistical significance was set at p < 0.05 (two-tailed). Further, we investigated LCA modeling including only clinical and laboratory data or only CT-derived features.

Sample size
We aimed to collect data from at least 500 patients, as this is considered an adequate sample size to conduct LCA [19].
Comprehensive information on methods is reported in the Supplemental material.
This study followed the Strengthening the Reporting of Observational studies in Epidemiology (STROBE) reporting guideline checklist.

Patient population and stratification by LCA
Clinical, laboratory and CT data were collected between February and April 2020, during the peak of the first Italian wave of the COVID-19 pandemic.
Out of 853 patients, 810 fulfilled study inclusion criteria and had a diagnosis of COVID-19 respiratory failure at ED admission. Five hundred and fifty-nine (559) patients had complete records including clinical, laboratory and CT variables to build the LCA model (online supplemental Fig. 3). Patients' characteristics are reported in Table 1.
We identified 2 different clusters of patients that we labeled as subphenotype 1 and subphenotype 2 (Fig. 1). Differences in LCA variables are reported in online supplemental Table 2. The subphenotype 1 was radiologically characterized by higher lung weight, lower lung gas volume, and higher proportion of consolidation. Oxygenation was worse in the subphenotype 1 as compared with the subphenotype 2, with no difference in PaCO2 levels. Inflammatory biomarkers such as white blood cells (WBC), high sensitivity C-reactive protein, and platelets were higher in the subphenotype 1. Patients in the subphenotype 1 were older and had higher creatinine levels as compared with the subphenotype 2. Comorbidities associated with endothelial dysfunction (e.g., systemic hypertension, diabetes, chronic kidney disease and congestive heart failure) were more prominent in subphenotype 1. Consistently, greater endothelial activation was indicated by higher levels of D-Dimers in subphenotype 1. Low-flow FiO2 requirement was higher in the subphenotype 1 at hospital admission (Table 1).

Quantitative and qualitative CT analysis by automated segmentation using deep neural network algorithm
Exemplary images of 10 patients with subphenotype 1 (Fig. 2, upper panel) and subphenotype 2 (Fig. 2, bottom panel) are shown. We provided a detailed description of regional quantitative and qualitative CT differences across the 2 latent classes in Table 2. At a regional level, the gravitational and apical-basal density gradients were significantly higher in the subphenotype 1 as compared with the subphenotype 2 (Fig. 3A, B). This was explained by a higher change in consolidation (Fig. 3G, H) and a lower change in ground glass opacities (Fig. 3D), respectively.
In contrast, the submantellar-hilar density gradient was higher in the subphenotype 2 as compared with the subphenotype 1 (Fig. 3C). This was driven by a higher change in ground glass opacities in subphenotype 2 versus subphenotype 1 (Fig. 3F).
Lung gas volume mildly correlated with oxygenation in patients with subphenotype 1, while mean lung density and lung weight correlated with oxygenation in both LCA clusters (Fig. 4A-C). None of the imaging features of the 2 subphenotypes correlated with PaCO2 levels (Fig. 4D-F).

Pharmacological treatments and complications
During the hospital stay, patients in the subphenotype 1 more frequently received antibiotics, hydroxychloroquine and chloroquine, and anticoagulation, with a trend toward higher use of steroids.
A higher rate of complications was present in subphenotype 1 as compared with subphenotype 2 (Table 1).

Outcome analysis
The subphenotype 1 had a higher hospital mortality rate (58% vs. 22%, p < 0.001) and, in survivors, a longer length of stay (21 (15) vs. 13 (16), p < 0.001) as compared with the subphenotype 2. No differences in ICU outcomes were observed. Limitation of life sustaining measures was more frequent in the subphenotype 1 (Table 1).
In univariable Cox proportional hazards regression modeling, subphenotype 1 was significantly associated with 90-day mortality (HR 3.49, 95% CI [2.60-4.69], p < 0.001). This was confirmed after adjustment for clinically meaningful variables including sex, the presence of any comorbidities and limitation of life sustaining measures. The best-fitting models were the one adjusted for all the tested variables (lowest AIC = 1797) and the one adjusted for both limitation of life sustaining measures and the presence of any comorbidities (lowest BIC = 1813) (Table 3).
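For readers less familiar with these criteria, the sketch below shows how AIC and BIC are conventionally derived from a model's maximized log-likelihood and how candidate models can be ranked by them. The log-likelihood values and model labels are hypothetical, not the study's results, and the choice of the sample-size term in BIC is an assumption, since statistical packages differ on whether they use the number of observations or the number of events for Cox models.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical (log-likelihood, number of coefficients) for two Cox models.
models = {
    "subphenotype only": (-905.0, 1),
    "subphenotype + sex + comorbidities + limitation": (-895.0, 4),
}
n_obs = 549  # assumed sample-size term; some software uses the number of events

ranking_by_aic = sorted(models, key=lambda name: aic(*models[name]))
for name in ranking_by_aic:
    ll, k = models[name]
    print(f"{name}: AIC = {aic(ll, k):.1f}, BIC = {bic(ll, k, n_obs):.1f}")
```

Because BIC penalizes each extra coefficient by ln(n) rather than 2, the two criteria can favor different models, as happens with the AIC-best and BIC-best models reported above.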
Further, we investigated whether subphenotypes obtained by LCA including only clinical and laboratory data, or only CT-derived features (data not shown), were differently associated with 90-day mortality. The subphenotype 1 obtained by LCA including all clinical, laboratory and CT-derived variables was associated with the highest 90-day mortality risk (n = 559; subphenotype 1 versus subphenotype 2; HR 3.46; 95% CI 2.58-4.64; p < 0.001) and highest goodness of fit (AIC, 2153; BIC, 2157) as compared to LCA modeling only including clinical and laboratory data (n = 559; subphenotype 1 vs. subphenotype 2; HR 3.23; 95% CI 2.40-4.35; p < 0.001; AIC, 2164; BIC, 2169) or only including CT-derived features (n = 559; subphenotype 1 ...).
Furthermore, as age is an important predictor of mortality and may influence clinical decision making, we explored whether retaining or removing age from the LCA may help to improve the outcome prediction in our study population. Presence (n = 559; subphenotype 1 versus subphenotype 2; HR 3.46; 95% CI 2.58-4.63; p < 0.001; AIC = 2153; BIC = 2157) or absence (n = 559; subphenotype 1 versus subphenotype 2; HR 3.54; 95% CI 2.64-4.75; p < 0.001; AIC = 2152; BIC = 2156) of age within the LCA modeling made no difference in the prediction of 90-day mortality, as shown by AIC and BIC values. This confirmed the goodness of our original LCA modeling including age in separating latent classes independently from outcomes.
Fig. 1 Differences in standardized values of each continuous variable by LCA derived subphenotypes. The variables are sorted based on the degree of separation between the subphenotypes, from maximum positive separation on the left (i.e., subphenotype 2 higher than subphenotype 1) to maximum negative separation on the right (i.e., subphenotype 2 lower than subphenotype 1). The y-axis describes the standardized variable values, in which all means are scaled to zero and standard deviations (SDs) to one. A value of +1 for the investigated standardized variable means that the mean value for a given subphenotype was one SD higher than the mean value in the cohort as a whole. Mean values are joined by lines to facilitate displaying subphenotype profiles. Variables included to investigate LCA derived subphenotypes are highlighted in green (CT-derived features) and red (clinical and laboratory parameters). WBC white blood cells, CRP C-reactive protein, PaCO2 arterial carbon dioxide partial pressure, PaO2/FiO2 ratio of arterial oxygen partial pressure to fractional inspired oxygen

Differences between patients included and excluded from the LCA model
A comprehensive description of differences in demographic, clinical, CT and outcome characteristics between patients with complete and incomplete data is presented in online supplemental Tables 3-5.

Discussion
In this retrospective multicenter observational study performed during the peak of the COVID-19 pandemic in Italy, we observed the following major findings in spontaneously breathing patients during their early hospital admission:
• LCA separated two different subphenotypes using clinical, laboratory and chest CT data analyzed by AI, which were characterized by different levels of systemic inflammatory biomarkers, oxygenation, and lung injury distribution;
• using automated segmentation with deep learning analysis, we observed higher mean lung density and lower gas content in the lungs of patients within the subphenotype 1, with a larger proportion of consolidation and ground glass attenuation as compared with the subphenotype 2;
• the 2 subphenotypes showed different spatial heterogeneity, with a higher gravitational and apical-basal density gradient mainly led by consolidation in subphenotype 1, and a higher submantellar-hilar density gradient mainly led by ground-glass opacities in subphenotype 2;
• the subphenotype 1 had a higher rate of hospital mortality, confirmed in multivariable models adjusted for clinically meaningful variables.
The SARS-CoV-2 pandemic nearly overwhelmed the Italian healthcare system in the first half of 2020, imposing a dramatic burden on intensive care units [22]. Nevertheless, this surge allowed us to collect a large amount of data on this specific respiratory condition [23,24]. We therefore decided to perform this exploratory study to test the hypothesis that integrating ...
Population enrichment by ARDS phenotyping has been proposed to reduce between-subject heterogeneity, paving the road to precision medicine [25]. Within this context, LCA using clinical and biological data identified a hyperinflammatory cluster of ARDS that was associated with a high mortality rate [20,26] and differential treatment responses [8]. In contrast, the efficacy of this approach in COVID-19 respiratory failure is uncertain. Several prognostic models have been proposed for COVID-19 but did not show accurate prediction of clinical deterioration or mortality [27,28]. Sinha et al. reported that the role of inflammation may be less impactful on outcomes than in classical ARDS [29]. Bos et al. did not report the presence of consistent respiratory subphenotypes in COVID-19 patients [30]. In contrast, Ranjeva et al. observed 2 distinct subphenotypes of COVID-19 respiratory failure with substantial differences in biochemical profiles and coagulopathy [31]. Furthermore, when using only CT data to stratify COVID-19 respiratory failure, Robba et al. reported that specific chest CT patterns may help to optimize the ventilator strategy [32].
Filippini et al. previously applied an LCA to lung-CT and ventilatory data in a small sample of mechanically ventilated patients to identify lung recruitability [33]. In contrast, we studied only patients who were captured early in their clinical course, shortly after hospital admission and while breathing spontaneously. In this population, we explored LCA by combining clinical and biological data with imaging metrics. We identified a subphenotype 1, associated with more heterogeneous injury on pulmonary CT and with the presence of higher levels of systemic inflammatory biomarkers. This subphenotype 1 had worse oxygenation, which was related to metrics of radiological severity. Moreover, our data suggest the presence of more severe vascular endothelial dysfunction in the subphenotype 1, because of the higher frequency of vascular comorbidities (e.g., diabetes, systemic hypertension, chronic renal failure and congestive cardiac failure) [34]. Higher D-Dimer levels also support endothelial dysfunction in this subgroup of patients, which is a known proxy of pulmonary hypoperfusion in COVID-19 patients [35] and may have contributed to worse gas exchange. Notably, unlike in studies of ARDS phenotypes, plasma bicarbonate did not differ between the 2 subphenotypes [5,20,29].
The use of machine learning techniques enables processing a large-volume image dataset, using a validated method of radiological processing [12,13]. This quantitative lung CT analysis informed us on mean lung density distributions in both subphenotypes. We observed significant differences in mean lung density distributions, although the amount of poorly aerated lung tissue was relatively low in the subphenotype 2 and the majority of segmented lung was contained within the normal range of aeration [17]. Despite these subtle alterations in lung aeration (which may also be overemphasized by the presence of spontaneous breathing in all patients), all evaluated lung regions in the subphenotype 1 were quantitatively denser and heavier and the whole lung gas volume was lower, as compared with the subphenotype 2.
Furthermore, the subphenotype 1 showed a higher quantitative gravitational and apical-basal density gradient, while the subphenotype 2 showed a higher submantellar-hilar gradient. These findings provide a morphological description of the 2 subphenotypes by adding a morphological quantification to LCA characterization of clinical severity.
The identification of two different clusters was highly prognostic, as the subphenotypes had different associations with hospital mortality. We adjusted the model for clinically meaningful variables known to affect mortality in patients with respiratory failure: sex [36], comorbidities, and limitation of life sustaining measures [37]. After adjustment, subphenotype 1 remained a robust predictor of death with an OR of 2.86 as compared with the subphenotype 2. These findings confirm a correlation with mortality of subphenotypes of respiratory failure identified with clinical and biological data [20] and with CT qualitative data [9]. This analysis suggests that the interaction between medical statistics (LCA) and artificial intelligence (deep learning analyses on automated segmentation of lung CT images) may be a robust ground on which to build and strengthen medical evidence [38].
Because of their early hospital admission, our population included patients who were clinically evaluated during low-flow oxygen administration or in ambient air. One out of ten of these patients was admitted to ICU. An open question is whether a specific early non-invasive respiratory support or the need for invasive mechanical ventilation may act differently as an outcome modifier in the 2 subphenotypes of spontaneously breathing COVID-19 patients. ICU admission was higher in the subphenotype 1, but no difference in ICU mortality was reported between the 2 subphenotypes, suggesting a similar mortality risk when the patients were admitted to the ICU and underwent mechanical ventilation.
Our study has several strengths. First, this is the first study that analyzes a high number of CT studies with a validated machine learning analysis method in spontaneously breathing COVID-19 patients. Second, we built a latent class model in which we added imaging metrics to clinical and laboratory data to provide a characterization of the morphological lung injury patterns of the identified subphenotypes. Third, this is a multicenter clinical trial in which 7 Italian centers and 1 center from the San Marino Republic obtained clinical and lung imaging data in a specific subpopulation of COVID-19 patients enrolled in the middle of a global pandemic. Fourth, patients were enrolled in the same pandemic wave, limiting variation linked to changes in SARS-CoV-2 genetic variants and to the identification of potential treatments and preventive measures (e.g., steroids, vaccines).
This study has some limitations. First, this is a retrospective observational cohort study of data collected in the middle of a global pandemic, so we could not perform an external validation. However, data were collected from different centers during the first European pandemic wave. Second, we had missing data, forcing us to reduce the population size from 810 to 559 patients to build the LCA. Consequently, we reported a comprehensive description of differences between the cohorts of patients with complete and incomplete data for LCA. Third, we had limited data on BMI and D-Dimers because of the pandemic surge. However, although in a reduced sample size, BMI did not differ between the subphenotypes, while D-Dimer levels were significantly higher in the subphenotype 1, suggesting a higher proportion of endothelial dysfunction-correlated comorbidities. Furthermore, lung CT data did not include angiograms [35] or CT techniques exploring gas:blood volume mismatch that may serve as proxies of impaired lung perfusion [39,40]. However, as previously mentioned, the higher levels of D-Dimer in the subphenotype 1 may suggest a higher probability of lung malperfusion [35] as compared with the subphenotype 2. Fourth, the biomarkers included in these analyses were limited to those that were measured in an emergency setting but were in line with previous work [5,8,31]. Consideration of these biomarkers, and/or of alternative proteomic, genomic or metabolomic markers, may recognize these biomarkers as important subphenotype classifiers.
In conclusion, during the first pandemic wave in a western country, we identified two different subphenotypes by LCA on clinical, biological and lung-CT data in COVID-19 patients who were studied while spontaneously breathing and shortly after admission. The subphenotypes were differently associated with hospital mortality and were robust to adjustment for clinically meaningful variables. These findings suggest a potential role of lung imaging in subphenotyping patients with acute respiratory failure, provided that images are objectively analyzed, a task now made possible by machine learning.
The LCA model including only clinical and laboratory data comprised PaO2/FiO2, Temperature, PaCO2, Total Bilirubin, Platelets, Age, Creatinine, hs-CRP and WBC, and the model including only CT-derived features comprised Lung gas volume, Lung mass and Consolidation fraction; both were used to assess whether the most complete LCA model, including mixed clinical, laboratory and CT data, showed a better association with 90-day mortality and the highest goodness of model fit. Statistical analysis was performed with SPSS software v28 (IBM Corp., Armonk, NY, USA), R (version 4.3.2) and Stata/MP 17.0 (StataCorp LLC, College Station, TX, USA).

Fig. 3 Box and whisker plots of mean lung density, ground glass opacities, and consolidation distribution in subphenotype 1 and subphenotype 2 across 3 different gradients of lung injury. Ventro-dorsal gradient (panels A, D and G); apical-basal gradient (panels B, E and H); and submantellar-hilar gradient (panels C, F and I)

Fig. 4 Correlation between CT derived parameters and gas exchange. Panel A: correlation between mean lung density and PaO2/FiO2; panel B: correlation between lung gas volume and PaO2/FiO2; panel C: correlation between lung weight and PaO2/FiO2; panel D: correlation between mean lung density and PaCO2; panel E: correlation between lung gas volume and PaCO2; panel F: correlation between lung weight and PaCO2. HU Hounsfield units, PaO2/FiO2 ratio of arterial oxygen partial pressure to fractional inspired oxygen, PaCO2 arterial carbon dioxide partial pressure

Table 1
Baseline characteristics, comorbidities, clinical illness severity, respiratory support at hospital admission; treatments and outcomes of patients stratified by subphenotypes

Table 1
(continued) Differences between the 2 subphenotypes were assessed and reported in the p value column. Continuous data are expressed as mean (standard deviation), categorical variables as count (relative frequency %). In the presence of missing data, sample size was reported. ACE angiotensin converting enzyme, ARB angiotensin receptor blockers, BMI body mass index, COPD chronic obstructive pulmonary disease, cPAP continuous positive airway pressure, ICU intensive care unit, LFO low-flow oxygen, LMWH low molecular weight heparin, LOS length of stay, OSAS obstructive sleep apnea syndrome, UFH unfractionated heparin

Table 3
Univariable and multivariable Cox proportional regression models explore the independent association of subphenotypes with 90-day mortality by including clinically meaningful covariates. Mortality risk was reported by hazard ratio with 95% confidence interval. Adjusted models were ranked by their Akaike information criterion (AIC) and their Bayesian information criterion (BIC). N = 549